# XLSR-53 fine-tuning
Wav2vec2 Large Xlsr Deepfake Audio Classification
Apache-2.0
An audio classification model based on the wav2vec2 architecture, fine-tuned for deepfake audio detection tasks, excelling in gender recognition and fake audio detection.
Audio Classification
Transformers

Gustking
345
3
Wav2vec2 Large Xlsr 53 Amharic
MIT
This model is an automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-large-xlsr-53 on an Amharic speech corpus.
Speech Recognition
Transformers Other

agkphysics
2,539
4
Exp W2v2t It Xlsr 53 S387
Apache-2.0
An Italian automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice 7.0 Italian dataset.
Speech Recognition
Transformers Other

jonatasgrosman
18
0
Ai Light Dance Stepmania Ft Wav2vec2 Large Xlsr 53 V7
Apache-2.0
An automatic speech recognition model based on wav2vec2-large-xlsr-53, specifically optimized for StepMania game audio, fine-tuned on the GARY109/AI_LIGHT_DANCE dataset
Speech Recognition
Transformers

gary109
162
0
Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53 5gram V4 2
An automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53 on the GARY109/AI_LIGHT_DANCE dataset.
Speech Recognition
Transformers

gary109
68
1
Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53 5gram V3
An automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53, specializing in singing voice recognition.
Speech Recognition
Transformers

gary109
97
0
Ai Light Dance Stepmania Ft Wav2vec2 Large Xlsr 53 V6
Apache-2.0
This model is an automatic speech recognition (ASR) model fine-tuned from wav2vec2-large-xlsr-53 on the GARY109/AI_LIGHT_DANCE - ONSET-STEPMANIA2 dataset.
Speech Recognition
Transformers

gary109
160
0
Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53 5gram V4 1
This model is an automatic speech recognition (ASR) model based on the wav2vec2-large-xlsr-53 architecture, fine-tuned on the GARY109/AI_LIGHT_DANCE - ONSET-SINGING2 dataset, primarily used for singing voice recognition tasks.
Speech Recognition
Transformers

gary109
66
1
Ai Light Dance Stepmania Ft Wav2vec2 Large Xlsr 53 V3
Apache-2.0
Automatic speech recognition model based on wav2vec2-large-xlsr-53, fine-tuned on the GARY109/AI_LIGHT_DANCE dataset
Speech Recognition
Transformers

gary109
191
0
Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53 V1
Apache-2.0
This model is an automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53 on the GARY109/AI_LIGHT_DANCE - ONSET-SINGING2 dataset, primarily used for singing voice recognition tasks.
Speech Recognition
Transformers

gary109
185
0
Ai Light Dance Stepmania Ft Wav2vec2 Large Xlsr 53 V2
Apache-2.0
This model is an automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53 on the GARY109/AI_LIGHT_DANCE dataset.
Speech Recognition
Transformers

gary109
166
0
Ai Light Dance Stepmania Ft Wav2vec2 Large Xlsr 53 V1
Apache-2.0
This model is an automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53 on the GARY109/AI_LIGHT_DANCE - ONSET-STEPMANIA2 dataset.
Speech Recognition
Transformers

gary109
48
0
Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53
Apache-2.0
This model is an automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the AI Light Dance dataset.
Speech Recognition
Transformers

gary109
26
1
Ai Light Dance Chord Ft Wav2vec2 Large Xlsr 53
Apache-2.0
This model is an automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the GARY109/AI_Light_Dance - ONSET-CHORD2 dataset.
Speech Recognition
Transformers

gary109
46
0
Ai Light Dance Singing Ft Wav2vec2 Large Xlsr 53
Apache-2.0
This model is an automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the AI_LIGHT_DANCE - ONSET-SINGING dataset, primarily used for singing voice recognition tasks.
Speech Recognition
Transformers

gary109
23
1
Wav2vec2 Large Multilang Cv Ru
Apache-2.0
This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the common_voice dataset, primarily designed for Russian speech recognition tasks.
Speech Recognition
Transformers

cutten
16
0
Wav2vec2 Large Xlsr 53 Tr Fine Tuning Deprecated
Apache-2.0
This model is a speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice Turkish dataset.
Speech Recognition
Transformers

bekirbakar
17
0
Wav2vec2 Large Xlsr 53 842h Luxembourgish 14h
MIT
A large wav2vec 2.0 model trained on 842 hours of unlabeled and 14 hours of labeled Luxembourgish speech data, supporting Luxembourgish speech recognition.
Speech Recognition
Transformers Other

Lemswasabi
204
0
FYP ARABIZI
Apache-2.0
This model is a speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on an unknown dataset, supporting recognition of Arabic dialects (Arabizi).
Speech Recognition
Transformers

ali-issa
33
1
Bach Arb
A German speech recognition model fine-tuned from jonatasgrosman/wav2vec2-large-xlsr-53-german.
Speech Recognition
Transformers

bkh6722
30
0
Wav2vec2 Common Voice Lithuanian
Apache-2.0
This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the COMMON_VOICE - LT dataset for Lithuanian speech recognition.
Speech Recognition
Transformers Other

birgermoell
17
0
Ft Pt Br Local
Apache-2.0
A fine-tuned Portuguese automatic speech recognition model based on jonatasgrosman/wav2vec2-large-xlsr-53-portuguese
Speech Recognition
Transformers

tonyalves
31
1
Wav2vec2 Common Voice Ab Demo
Apache-2.0
A speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the COMMON_VOICE - AB dataset.
Speech Recognition
Transformers Other

patrickvonplaten
18
0
Wav2vec2 Common Voice Tr Demo
Apache-2.0
This model is an automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the COMMON_VOICE SV-SE dataset, supporting Swedish speech recognition.
Speech Recognition
Transformers

birgermoell
17
0
Wav2vec2 Large Xlsr 53 W2V2 TATAR SMALL
Apache-2.0
This model is a Tatar automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice 8 dataset, with a test set WER of 53.16%.
Speech Recognition
Transformers Other

emre
30
1
Wav2vec2 Large Xlsr 53 German
Apache-2.0
This is a fine-tuned XLSR-53 large model for German speech recognition tasks, based on Facebook's wav2vec2-large-xlsr-53 model and fine-tuned on the Common Voice 6.1 German dataset.
Speech Recognition German
jonatasgrosman
8,266
7
Wav2vec2 Large Xlsr 53 Finnish
Apache-2.0
A Finnish speech recognition model fine-tuned from the XLSR-53 large model, supporting 16 kHz audio input.
Speech Recognition Other
jonatasgrosman
73.11k
1
Wav2vec2 Xlsr Khmer
Apache-2.0
A Khmer speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, achieving a WER of 24.96% on the OpenSLR Khmer dataset.
Speech Recognition Other
gagan3012
172
1
Wav2vec2 Large Xlsr 53 French
Apache-2.0
This is a French speech recognition model fine-tuned from the XLSR-53 large model, trained on the Common Voice dataset, supporting high-accuracy French speech-to-text conversion.
Speech Recognition French
jonatasgrosman
47.83k
11
Wav2vec2 Large Xlsr 53 Euskera
Apache-2.0
A speech recognition model fine-tuned on the Basque language (Euskera) using the Common Voice dataset, based on the facebook/wav2vec2-large-xlsr-53 model.
Speech Recognition Other
mrm8488
28
0
Wav2vec2 Xlsr 53 Rm Vallader With Lm
Apache-2.0
Romansh Vallader dialect speech recognition model based on wav2vec2-xlsr-53 with language model support
Speech Recognition
Transformers

anuragshas
16
0
Fb Vindata Vi Large
Apache-2.0
This model is a Vietnamese automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the PHONGDTD/VINDATAVLSP - NA dataset.
Speech Recognition
Transformers

phongdtd
29
0
Wav2vec2 Large Xlsr Kn
Apache-2.0
This is an automatic speech recognition (ASR) model fine-tuned on Kannada language using Facebook's wav2vec2-large-xlsr-53 model, trained with the OpenSLR SLR79 dataset.
Speech Recognition Other
amoghsgopadi
2,200
1
Wav2vec2 Large Xlsr 53 Estonian
Apache-2.0
Estonian speech recognition model fine-tuned from Facebook's XLSR-53 large model, achieving 30.74% word error rate on Common Voice dataset
Speech Recognition Other
anton-l
3,259
0
Wav2vec2 Xlsr 53 Tamil
Apache-2.0
A Tamil speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, trained on the Common Voice Tamil dataset.
Speech Recognition Other
anuragshas
64
0
Wav2vec2 Large Xlsr 53 Spanish
Apache-2.0
A Spanish speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, trained on the Common Voice 6.1 Spanish dataset
Speech Recognition Spanish
jonatasgrosman
46.28k
30
Wav2vec2 Large Xlsr 53 Greek
Apache-2.0
This is a large XLSR-53 model fine-tuned for Greek speech recognition tasks, based on facebook/wav2vec2-large-xlsr-53 model, trained using Common Voice 6.1 and CSS10 datasets.
Speech Recognition Other
jonatasgrosman
130.81k
1
Wav2vec2 Large Xlsr 53 Dutch
Apache-2.0
A Dutch speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice and CSS10 datasets, supporting 16 kHz audio input.
Speech Recognition Other
jonatasgrosman
3.0M
12
Wav2vec2 Large Xlsr 53 Sah CV8
Apache-2.0
A speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice Yakut dataset.
Speech Recognition
Transformers Other

emre
19
0
Wav2vec2 Large Xlsr 53 Sakha
Apache-2.0
Yakut speech recognition model fine-tuned from XLSR-53 large model, with 32.23% word error rate
Speech Recognition Other
anton-l
25
0
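
Several of the entries above quote a word error rate (WER), for example 53.16% for the Tatar model and 32.23% for the Sakha model. As a rough illustration of what that figure measures, the sketch below computes WER as the word-level edit distance between a reference transcript and a model transcript, divided by the number of reference words. The function name `word_error_rate` and the sample sentences are illustrative assumptions, not taken from any of the listed model cards, and the individual models may normalize text differently before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    reference = "the cat sat on the mat"
    hypothesis = "the cat sit on mat"
    print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.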